Cluster Weighted Beta Regression
Authors
Abstract
The analysis of data taking values in the open real interval (0, 1) is a common problem in quantitative research when the effect of selected variables on the conditional expectation of a percentage or rate is of interest. Various methods for modelling ratios and percentage data have been proposed in the literature (see e.g. Papke and Wooldridge, 1996; Kieschnick and McCullough, 2003). One possible solution is to transform the dependent variable y, for instance with a logit or probit transformation, so that it takes values on the whole real line, and then to model the mean of the transformed response as a linear predictor based on a set of covariates, using OLS (Demsetz and Lehn, 1985) to obtain the parameter estimates. This approach has drawbacks, however. One is that the model parameters cannot easily be interpreted in terms of the mean of the original outcome, only in terms of the transformed response. Furthermore, the assumptions of OLS regression are often not met even after the data are transformed. An alternative is to use a regression model that assumes that the response variable follows a beta distribution on the interval (0, 1).
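The transform-then-OLS approach and its interpretation drawback can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the paper: the synthetic data, the single covariate, the true coefficients (-0.5 and 1.0), the noise level, and the helper names `logit`, `inv_logit` and `ols_simple`. It uses only the Python standard library.

```python
import math
import random


def logit(p):
    """Map a value in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Map a real value back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def ols_simple(x, y):
    """Closed-form OLS for y = a + b * x with a single covariate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


random.seed(0)

# Hypothetical synthetic data: logit(y) = -0.5 + 1.0 * x + Gaussian noise,
# so every y lies strictly inside (0, 1).
n_obs = 5000
x = [random.uniform(-2.0, 2.0) for _ in range(n_obs)]
y = [inv_logit(-0.5 + 1.0 * xi + random.gauss(0.0, 1.0)) for xi in x]

# Step 1: transform the response to the whole real line.
z = [logit(yi) for yi in y]

# Step 2: fit a linear predictor for the mean of the transformed response.
a_hat, b_hat = ols_simple(x, z)

# The coefficients describe E[logit(y) | x], not E[y | x].
# Back-transforming the fitted value at x = 0 is therefore only a
# naive stand-in for the conditional mean of y at x = 0.
naive_mean_at_0 = inv_logit(a_hat)

# Monte Carlo estimate of E[y | x = 0] under the same data-generating model.
n_sim = 20000
mc_mean_at_0 = sum(
    inv_logit(-0.5 + random.gauss(0.0, 1.0)) for _ in range(n_sim)
) / n_sim

print("intercept, slope on the logit scale:", a_hat, b_hat)
print("back-transformed fitted value at x = 0:", naive_mean_at_0)
print("simulated mean of y at x = 0:", mc_mean_at_0)
```

Because the inverse logit is nonlinear, the back-transformed fitted value and the simulated conditional mean generally differ, which is the interpretation problem described above. A beta regression avoids this by modelling the mean of y directly on (0, 1).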